In [1]:
from bertviz import head_view, model_view
from transformers import BertTokenizer, BertModel

model_version = 'bert-base-uncased'
model = BertModel.from_pretrained(model_version, output_attentions=True)
tokenizer = BertTokenizer.from_pretrained(model_version)
sentence_a = "The CEO's name is Alice"
sentence_b = "The CEO's name is Bob"
inputs = tokenizer.encode_plus(sentence_a, sentence_b, return_tensors='pt')
input_ids = inputs['input_ids']
token_type_ids = inputs['token_type_ids']
attention = model(input_ids, token_type_ids=token_type_ids)[-1]
sentence_b_start = token_type_ids[0].tolist().index(1)
input_id_list = input_ids[0].tolist() # Batch index 0
tokens = tokenizer.convert_ids_to_tokens(input_id_list)
BertSdpaSelfAttention is used but `torch.nn.functional.scaled_dot_product_attention` does not support non-absolute `position_embedding_type` or `output_attentions=True` or `head_mask`. Falling back to the manual attention implementation, but specifying the manual implementation will be required from Transformers version v5.0.0 onwards. This warning can be removed using the argument `attn_implementation="eager"` when loading the model.
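With `output_attentions=True`, the model's last return value is a tuple with one tensor per layer, each shaped `(batch_size, num_heads, seq_len, seq_len)` — for `bert-base-uncased`, 12 layers of 12 heads. A minimal sketch of that structure, using dummy tensors so the indexing can be shown without downloading the model:

```python
import torch

# Assumed shapes for bert-base-uncased: 12 layers x 12 heads; the data here
# is random softmax output standing in for real attention weights.
num_layers, num_heads, seq_len = 12, 12, 16
attention = tuple(
    torch.softmax(torch.randn(1, num_heads, seq_len, seq_len), dim=-1)
    for _ in range(num_layers)
)

layer, head = 4, 3
weights = attention[layer][0, head]   # one head's (seq_len, seq_len) matrix
print(len(attention))                 # 12 layers
print(weights.shape)                  # torch.Size([16, 16])
```

This is the tuple that `head_view`, `model_view`, and `neuron_view` consume below.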

How to Use

➡️ Hover over any token on the left or right side to view the attention from or to that token.
➡️ Double-click a colored tile at the top to focus on a single attention head.
➡️ Click a colored tile to show or hide the corresponding attention head.
➡️ Select a layer from the Layer drop-down to switch between model layers (starting at zero).
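Numerically, "attention from" a token is a row of a head's weight matrix, and "attention to" a token is a column. A sketch with a random matrix (not real model output):

```python
import torch

# One head's attention matrix: rows are query tokens, columns are key tokens.
weights = torch.softmax(torch.randn(5, 5), dim=-1)

i = 2
attn_from = weights[i]       # attention from token i to every token (a row)
attn_to = weights[:, i]      # attention to token i from every token (a column)

# Rows are softmax outputs, so "attention from" always sums to 1;
# columns carry no such constraint.
assert torch.isclose(attn_from.sum(), torch.tensor(1.0))
```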

In [2]:
head_view(attention, tokens, sentence_b_start)
[Interactive head view rendered here, with Layer and Attention drop-downs.]

Model View

The model view shows an overview of attention across the entire model.

Each cell represents the attention weights for a specific head, organized by layer (row) and head (column). The lines in each cell indicate the attention from one token (left) to another (right), with line thickness reflecting the attention value (ranging from 0 to 1).
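The 0-to-1 range is not incidental: each row of an attention matrix is a softmax output, so every weight the model view draws lies in [0, 1] and each row sums to 1 across the attended tokens. A minimal sketch with random scores:

```python
import torch

# Random attention scores for one head; softmax over the key dimension
# turns each row into a probability distribution over attended tokens.
scores = torch.randn(8, 8)               # (query tokens, key tokens)
weights = torch.softmax(scores, dim=-1)

assert torch.all(weights >= 0) and torch.all(weights <= 1)
assert torch.allclose(weights.sum(dim=-1), torch.ones(8))
```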

How to Use

➡️ Click on any cell to view detailed attention patterns for the selected attention head (or to deselect it).
➡️ Hover over any token in the detailed view to highlight the attention from that token.

In [3]:
model_view(attention, tokens, sentence_b_start)
[Interactive model view rendered here, with an Attention drop-down.]

Neuron View

The neuron view shows the intermediate representations (e.g., query and key vectors) used to compute attention.

In the collapsed view (default state), the lines represent the attention from each token (left) to every other token (right). In the expanded view, the tool reveals the sequence of computations that produce these attention weights.
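The sequence of computations the expanded view visualizes can be sketched directly: project the hidden states into query and key vectors, take their scaled dot product, and apply a softmax. The names and random weights below are illustrative, not taken from bertviz internals; the dimensions match bert-base-uncased (768 hidden units, 64 per head).

```python
import math
import torch

seq_len, d_model, d_head = 6, 768, 64   # bert-base: 768 hidden, 64 per head
x = torch.randn(seq_len, d_model)       # hidden states entering one layer

# Per-head query/key projections (random weights stand in for learned ones)
w_q = torch.randn(d_model, d_head)
w_k = torch.randn(d_model, d_head)
q = x @ w_q                             # query vectors (left side of the view)
k = x @ w_k                             # key vectors (right side)

scores = q @ k.T / math.sqrt(d_head)    # q . k, scaled by sqrt(d_head)
weights = torch.softmax(scores, dim=-1) # the attention pattern that gets drawn
print(weights.shape)                    # torch.Size([6, 6])
```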

How to Use

➡️ Hover over any token on the left side to highlight attention from that token.
➡️ Click the plus icon (visible when hovering) to display query vectors, key vectors, and other intermediate values used to compute attention. Each color band represents a neuron value, with color intensity reflecting magnitude and hue indicating sign (blue = positive, orange = negative).
➡️ In the expanded view, hover over any token to explore the corresponding attention computations.
➡️ Use the Layer or Head drop-downs to select a different model layer or attention head (starting at zero).

In [4]:
from bertviz.transformers_neuron_view import BertModel, BertTokenizer
from bertviz.neuron_view import show

model_type = 'bert'
model_version = 'bert-base-uncased'
model = BertModel.from_pretrained(model_version, output_attentions=True)
tokenizer = BertTokenizer.from_pretrained(model_version, do_lower_case=True)
show(model, model_type, tokenizer, sentence_a, sentence_b, layer=4, head=3)
/usr/local/lib/python3.11/dist-packages/bertviz/transformers_neuron_view/modeling_utils.py:482: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  state_dict = torch.load(resolved_archive_file, map_location='cpu')
[Interactive neuron view rendered here, with Layer, Head, and Attention drop-downs.]
In [ ]:
!jupyter nbconvert --execute --to html '/content/drive/MyDrive/Colab Notebooks/tutorial.ipynb'
[NbConvertApp] Converting notebook /content/drive/MyDrive/Colab Notebooks/tutorial.ipynb to html
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[NbConvertApp] ERROR | unhandled iopub msg: colab_request
In [1]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive